Search results for "dist:Lingua-Align Lingua Treebank"
Lingua::Align::Corpus::Treebank - Factory class for reading treebanks
Factory class of modules for reading treebanks in different formats. The default format is the Penn Treebank format. Other supported formats are the format produced by the Berkeley parser, the Stanford parser (including typed dependencies), TigerXML ...
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
Lingua::Align::Corpus::Treebank::Penn - Read the Penn Treebank format
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
Lingua::Align::Corpus::Treebank::TigerXML - Read the TigerXML format
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
Lingua::Align::Corpus::Treebank::Stanford - Read output from the Stanford parser
Module to read treebanks in Penn Treebank format including dependency relations produced by the Stanford parser. Note: Adding dependency relations to the phrase-structure trees is still a bit buggy.
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
Lingua::Align::Corpus::Treebank::Berkeley - Read the output of the Berkeley parser
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
Lingua::Align::Corpus::Treebank::AlpinoXML - Read Alpino XML
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
convert_treebank - convert a treebank from one format to another
This script allows you to convert a treebank to another format. The converted treebank is printed to STDOUT. Currently the following formats are supported: AlpinoXML (alpino) The XML format used by the Dutch dependency parser Alpino. Use the option [...
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
treebank2moses - convert treebanks to Moses/GIZA++ format (plain text)
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
Lingua::Align - Perl modules for the alignment of parallel corpora
Lingua::Align contains modules for automatic tree alignment based on discriminative classification and alignment inference. More details about the tree aligner can be found in Lingua::Align::Trees. The following gives a general overview and motivatio...
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
Lingua::Align::Corpus - reading corpus data
Read corpus data in various formats. Default format = plain text, 1 sentence per line. For other types (parsed corpora etc): Use the "-type" flag....
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
Lingua::Align::Features - Feature extraction for tree alignment
Extract features from a pair of nodes from two given syntactic trees (source and target language). The trees should be complex hash structures as produced by Lingua::Align::Corpus::Treebank::TigerXML. The returned features are given as simple key-val...
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
coocfreq - count co-occurrence frequencies for arbitrary features of nodes in a parallel treebank
This script counts frequencies and co-occurrence frequencies of source and target language features. It runs through the sentence aligned treebank and combines all node pairs. Note that co-occurrence frequencies in a sentence are " max( srcfreq(srcfe...
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
sta2moses - convert from Stockholm Tree Aligner format to Moses/GIZA++ (plain text)
This script reads through a parallel treebank using the tree alignment file (alignments.xml) and produces sentence aligned plain text files (to be used with Moses/Giza++). The corpus will be stored in alignments.src and alignments.trg....
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
treealign - training tree alignment classifiers and aligning syntactic trees
This script allows you to train a tree alignment model and to apply it to parallel treebanks. Tree alignment is based on local binary classification and rich feature sets. Currently, training data has to be in Stockholm Tree Aligner format. The out...
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
convert_bitext - a script for converting bitexts
Convert bitexts from one format to another. There are several formats supported by Lingua::Align. Check Lingua::Align::Corpus, Lingua::Align::Corpus::Treebank for more information....
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC
treealigneval - a script for computing precision and recall scores for tree alignment
Both gold-standard-file and tree-alignment-file should be in Stockholm Tree Aligner Format. Here is an example: <?xml version="1.0" ?> <treealign> <head> <alignment-metadata> <date>Tue May 4 16:23:04 2010</date> <author>Lingua-Align</author> </alignm...
TIEDEMANN/Lingua-Align-0.04 - 10 Dec 2012 18:31:24 UTC